RDFPRO: an extensible tool for building stream-oriented RDF processing pipelines
Abstract
We present RDFPRO (RDF Processor), an open-source Java command-line tool and embeddable library that offers a suite of stream-oriented, highly optimized processors for common tasks such as data filtering, RDFS inference, smushing, and statistics extraction. RDFPRO processors are extensible by users and can be freely composed into complex pipelines that process RDF data efficiently in one or more passes. We show how RDFPRO's model and multi-threaded design allow processing billions of triples in a few hours in a typical Linked Open Data integration scenario, and we discuss relevant implementation aspects and lessons learnt.
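The idea of composing stream-oriented processors into a pipeline can be illustrated with a minimal sketch. It is written in Python for brevity and is not RDFPRO's Java API: the names `filter_predicate`, `smush`, `count_predicates`, and `pipeline` are hypothetical, triples are plain string tuples, and smushing is simplified to a lookup in a precomputed alias map.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]
Processor = Callable[[Iterable[Triple]], Iterator[Triple]]


def filter_predicate(predicate: str) -> Processor:
    """Hypothetical processor: keep only triples with the given predicate."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        return (t for t in triples if t[1] == predicate)
    return run


def smush(aliases: Dict[str, str]) -> Processor:
    """Hypothetical processor: rewrite subjects and objects to a canonical
    name, using an alias map assumed to be computed beforehand."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        for s, p, o in triples:
            yield (aliases.get(s, s), p, aliases.get(o, o))
    return run


def count_predicates(stats: Counter) -> Processor:
    """Hypothetical processor: count predicates while passing triples through."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        for t in triples:
            stats[t[1]] += 1
            yield t
    return run


def pipeline(*processors: Processor) -> Processor:
    """Chain processors so each consumes the previous one's output lazily."""
    def run(triples: Iterable[Triple]) -> Iterator[Triple]:
        stream: Iterator[Triple] = iter(triples)
        for proc in processors:
            stream = proc(stream)
        return stream
    return run


triples = [
    ("ex:a", "rdf:type", "ex:Person"),
    ("ex:a2", "ex:name", "Alice"),
    ("ex:b", "ex:name", "Bob"),
]
stats: Counter = Counter()
process = pipeline(
    smush({"ex:a2": "ex:a"}),
    count_predicates(stats),
    filter_predicate("ex:name"),
)
result = list(process(triples))
```

Because every stage is a generator, a triple flows through all stages before the next one is read, so the input is traversed once and never held in memory as a whole. A real implementation would add parsing, sorting, and multi-threading, none of which this sketch attempts.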
Similar resources
Demonstrating the Power of Streaming and Sorting for Non-distributed RDF Processing: RDFpro
We demonstrate RDFpro (RDF Processor), an extensible, general-purpose, open source tool for processing large RDF datasets on a commodity machine leveraging streaming and sorting techniques. RDFpro provides out-of-the-box implementations, called processors, of common tasks such as data filtering, rule-based inference, smushing, and statistics extraction, as well as easy ways to add new processor...
Change-Resilient Design and Dataflow Optimization for Distributed XML Stream Processors
We propose a new stream-processing framework based on a virtual assembly line (val) model. We instantiate the val framework obtaining ∆-XML, an approach for designing and optimizing distributed XML processing pipelines. val/∆-XML greatly simplifies the design of change-resilient dataflow pipelines: XML processors (called actors) can be inserted, deleted, and their “scope of work” (the parts of ...
Implementation and Experiments of an Extensible Parallel Processing System Supporting User Defined Database Operations
This paper presents an implementation method and experimental results of an extensible parallel processing system for databases. We have already proposed a stream-oriented parallel processing scheme (stream-oriented scheme) of basic operations for databases and knowledge bases. This scheme is based on demand-driven evaluation incorporating stream processing. We have designed basic primitive...
Integrating XML and RDF Concepts to Achieve Automation within a Tactical Knowledge Management Environment
Since the advent of Naval Warfare, Tactical Knowledge Management (KM) has been critical to the success of the On Scene Commander. Today’s Tactical Knowledge Manager typically operates in a high-stress environment with a multitude of knowledge sources, including detailed sensor deployment plans, rules of engagement contingencies, and weapon delivery assignments. However, the WarFighter has place...
Study on Baseflow Separation of “Abolabas River” Using ADUKIH and RDF Methods
Objective: Currently, the evaluation of baseflow components has been of worldwide concern due to the influential role of streamflow (base flow and direct flow) in agriculture, water sources management, and supplying potable water. Direct and field measurement of baseflow is not practicable, especially in large areas with statistics deficiencies. Also, this would not be economically e...
Publication date: 2014